trained CNN model on the MIT-BIH dataset for two classes
(Normal, Abnormal) to classify ECG heartbeats. The training
and testing results were 98.6% and 99%, respectively, which
are very good and promising.
Introduction

Heart disease is the leading cause of death worldwide. According to the World Health Organization, it
accounts for 16% of all deaths. The number of heart disease fatalities has risen by more than 2 million since
2000, reaching 9 million in 2019 [1]. An ECG signal is a recording of the heart's bioelectrical activity [2].
Early diagnosis of cardiac illnesses (abnormalities), followed by effective treatment, can lengthen life and
improve its quality [3]. Cardiologists often use the ECG to evaluate heart health. The primary issue with manual
ECG signal analysis is the difficulty of recognizing the different waves in the signal, a problem shared with
many other kinds of time-series data. The task requires considerable human effort and is susceptible to
mistakes [4].
ECG is a quick test that uses electrodes placed on the skin to assess the electrical activity of the heart. It
displays the intensity and timing of the electrical impulses traveling through the heart, as well as whether the
rhythm of the heartbeats is normal or abnormal. Numerous cardiac disorders, such as disturbed heart rhythm
and electrolyte imbalances, can change normal ECG patterns. Physicians can identify cardiovascular
problems via computer-aided ECG analysis [5]. ECG signals are the most reliable indicators of heart activity
status. A typical ECG signal is made up of the T wave and the QRS complex, which together represent the most
significant components of the signal and are used, along with other aspects of the signal, to analyze the heart's
status. The P wave represents the activity of the upper chambers of the heart. Hence, any cardiac condition may
be seen in these parts of the waveform, notably in the QRS complex, where it appears as a shortening,
broadening, or lengthening of the complex [6-9].


A recurrent pattern of P, QRS, and T waves represents the rhythmic depolarization and repolarization of the
myocardium associated with the contraction of the atria and ventricles in every cardiac cycle (see Fig. 1) [10].


The term artificial intelligence (AI) is notoriously difficult to define. It is quite likely the most remarkable
and complicated creation of humans to date. In scientific terms, AI is a set of computational technologies
inspired by how humans use their nervous systems to perceive, observe, comprehend, and act [11,12,13].
Deep learning (DL) is one of the major topics in the fields of AI and machine learning (ML) [14].


Fig. 1. ECG signal waves (figure labels: ST segment, QT interval)


DL is a subset of ML, and therefore of AI, in terms of working domain. It is an AI function that mimics how
the human brain processes data [4,15].


As the amount of data grows, DL becomes more effective than traditional ML. To develop computational
models, DL represents data abstractions using numerous layers. In contrast to ML, DL runs fast at test time even
though it takes a long time to train because of its many parameters. Fig. 2 shows the position of DL relative to
ML and AI [16].


The purpose of this work is to balance the accuracy of heartbeat classification against the size of the
model, through experiments on the MIT-BIH dataset. After training the neural network on this dataset, the
proposed system successfully classifies heartbeats using deep learning networks with an accuracy of 99%.
Fig. 2. DL relation to ML and AI

Several strategies used by researchers to categorize ECG heartbeats are reviewed here.


I. Fathail and D. Bhagilewe (2022) devised a method that digitizes a paper ECG, recognizes R peaks,
computes the average heart rate, and sends an SMS when an abnormality is detected. Users upload an ECG
picture, which is reduced in dimension; its features are extracted as digital signals and saved in a CSV file
using MATLAB. The system then recovers the signals so that the raw signals can be processed further. Using
the Python programming language and the Twilio platform, it sends SMS alerts to doctors if the heart rate is
irregular. They examined 10 ECG papers. The accuracy of peak detection was approximately 90%, the true
positive rate was around 90%, and the false positive rate was up to 8% [1]. A. Faroog et al. (2021) developed a
method that uses LabVIEW to classify the observed ECG waveform. The input ECG signal is first acquired by
the sensor system and then processed in LabVIEW to produce a classification. They presented a
LabVIEW-based simulation that categorizes the ECG signal as healthy, unhealthy, or not specified. The
classification system is trained with ML (K-means clustering). They treated three patients for a total of 14
days. In the event of a health issue, an automated appointment can be made through SMS in 27.5 seconds [2].


P. Seitanidis et al. (2022) used a novel two-dimensional CNN classifier optimized for storage and
computational complexity, making it suitable for implementation on edge devices. In tests on the MIT-BIH
arrhythmia database, the suggested two-dimensional CNN achieves an accuracy of 95.3% [17]. H. Mohan et al.
(2021) developed a novel technique for detecting vital signs based on fall posture and chest discomfort, using
an intelligent surveillance camera with single-shot detectors, Inception V2, MobileNet V2, and NVIDIA's
Jetson Nano. They processed 3000 indoor color photo files using the proprietary RMS dataset and the Red,
Green, Blue, and Depth dataset from Nanyang Technological University. After examining the measures, they
reported an average accuracy and recall of 76.4% and 80%, respectively [18]. M. Sotorra (2019) designed and
built a data visualizer for the MIT-BIH dataset. He also used a deep autoencoder to compress the beats and
principal component analysis to reduce the data's dimensionality. The initial work is a convolutional
autoencoder with ten neurons. The calculated loss is 22.8 percent. The correlation coefficients between the
input vector and the autoencoder output are 0.99, 0.89, 0.96, 0.93, and 0.92 for beats N, L, R, V, and A,
respectively. A 5-neuron autoencoder was also trained for even more compression. Its loss is equal to 22.8%,
and its correlation coefficients are 0.95, 0.86, 0.83, 0.89, and 0.71, respectively [19].


Z. Zhang and W. Yan preprocessed ECG database data using a combination of a median filter and a band-stop
filter to account for individual variations in ECG waveforms. A model was built using deep neural network
techniques to address the variability of feature importance in the MIT-BIH dataset. The suggested multiorder


© 2023, CAJMTCS | CENTRAL ASIAN STUDIES www.centralasianstudies.org ISSN: 2660-5309 | 48 


CENTRAL ASIAN JOURNAL OF MATHEMATICAL THEORY AND COMPUTER SCIENCES Vol: 04 Issue: 12 | Dec 2023


classification method's average recognition rates with the enhanced wavelet features and the improved RBP
algorithm are 78.8% and 64.5%, respectively [20]. M. Ojha et al. (2021) built a one-dimensional CNN model
that learns the optimal ECG features from every heartbeat window using an auto-encoder convolution network.
A Support Vector Machine classifier then uses the auto-encoded features to recognize four forms of arrhythmic
beats, including regular beats. Tenfold cross-validation is used to analyze the model's statistical performance,
and the results show that the model has 98.84% accuracy, 99.53% average accuracy, 98.24% sensitivity, and
97.58% precision [21]. W. Ullah et al. (2021) used two datasets. The first is the MIT-BIH database, with
109446 ECG beats at a sampling frequency of 125 Hz, covering the classes N, S, V, F, and Q. The second is
the PTB ECG dataset, which has two classes. CNN, CNN+LSTM, and a second CNN+LSTM variant were
applied to these datasets, with 80% of each dataset used for training and 20% for testing. The reported
accuracies are 99.12 percent for CNN, 99.3 percent for CNN+LSTM, and 99.29 percent for the second
CNN+LSTM variant [10]. S. Irfan et al. (2022) proposed a DL framework that combines several techniques by
stacking the related layers of each technique to create a single, reliable model. The proposed methodology
identified five classes of arrhythmias on two datasets, with an accuracy of 99.35% [22]. F. Ibrahem and
M. Younes (2019) assessed the effectiveness of their method using a dataset containing 205,146 records. Their
approach, built on the classification techniques Decision Trees, Random Forests, and Gradient-Boosted Trees
(GDB), was assessed and verified on the MIT-BIH and Baseline MIT-BIH databases. Their findings indicate
that the overall accuracy for binary classification with the GDB Tree algorithm and the Random Forest
technique was 96.75%. With Random Forest, a 98.03% accuracy rate was attained for multi-class
classification [23].


PROPOSED METHOD 


A deep CNN architecture is built and trained on the MIT-BIH dataset with several parameters, as explained
later. The system is then tested on many ECG heartbeats. Fig. 3 shows a flowchart of the proposed system.

Fig. 3. The proposed system flowchart


A simple approach is proposed for preprocessing ECG data and extracting beats, which then serve as the
model inputs. The procedure for separating beats from an ECG signal is illustrated in Fig. 4. The beat
extraction approach is straightforward and useful for obtaining R-R intervals from signals. No processing such
as filtering is applied, and no assumptions are made about the signal's morphology or spectrum. Each extracted
beat has the same duration, which is important because the beats are used as inputs to the next processing steps.

Fig. 4. An ECG heartbeat and the beat extracted from it.
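The fixed-length property described above can be illustrated with a minimal sketch. It assumes that R-peak sample indices are already available (the text does not say how they are detected); the function names `extract_beats` and `rr_intervals`, the window length of 187 samples, and the zero-padding of short windows are illustrative assumptions rather than the authors' exact procedure.

```python
from typing import List, Sequence


def extract_beats(signal: Sequence[float],
                  r_peaks: Sequence[int],
                  beat_len: int = 187) -> List[List[float]]:
    """Cut one fixed-length window starting at each R peak.

    No filtering is applied. Windows that run past the end of the
    signal are zero-padded so every beat has exactly `beat_len` samples.
    """
    beats = []
    for peak in r_peaks:
        window = list(signal[peak:peak + beat_len])
        window += [0.0] * (beat_len - len(window))  # pad short windows
        beats.append(window)
    return beats


def rr_intervals(r_peaks: Sequence[int], fs: float = 360.0) -> List[float]:
    """R-R intervals in seconds from consecutive R-peak sample indices."""
    return [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]
```

Because every window has the same length, the extracted beats can be stacked directly into the rows of a training matrix.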


Dataset 


The MIT-BIH dataset is used as the source of labeled ECG records. It comprises 30-minute raw ECG
recordings made at a 360 Hz sampling rate from 47 distinct people, 25 of whom are male and 22 of whom are
female. Each beat is annotated by at least two cardiologists. The Kaggle version of the MIT-BIH dataset, with
109,446 rows and 188 columns, is split into two CSV files, (MIT-BIH train.csv) and (MIT-BIH test.csv), which
were used to train our CNN. They contain 87,554 rows and 21,892 rows, or 80% and 20%, for the training set
and validation set, respectively [22,24].
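The split sizes quoted above can be checked with a few lines of arithmetic, and multi-class labels can be collapsed to the two classes used in this work. This is only a sketch: it assumes that the last of the 188 columns holds the class label and that label 0 denotes a normal beat. `split_row` and `to_binary` are hypothetical helper names.

```python
from typing import List, Tuple

TRAIN_ROWS, TEST_ROWS, N_COLUMNS = 87554, 21892, 188


def split_row(row: List[float]) -> Tuple[List[float], int]:
    """Separate one CSV row into signal samples and its integer label.

    Assumption: the final column is the class label.
    """
    return row[:-1], int(row[-1])


def to_binary(label: int) -> int:
    """Map a multi-class label to 0 (Normal) or 1 (Abnormal).

    Assumption: label 0 is the normal class; any other label is abnormal.
    """
    return 0 if label == 0 else 1


total = TRAIN_ROWS + TEST_ROWS
train_fraction = TRAIN_ROWS / total
print(total, round(train_fraction, 3))  # 109446 0.8
```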


Algorithm 


An algorithm must be evaluated in order to understand its comparative properties. The main evaluation
approaches are empirical. In an experimental assessment, learning algorithms are applied to tasks to examine
how well they work in real-world situations. Some models produce good outcomes when analyzed, while others
produce poor ones, so this is a sound way to assess DL algorithms. Because these algorithms are among the
most common ones that yield high accuracy in classifying heartbeats, we used them to evaluate the model's
performance [8]. This work aims to build a system that performs part of the function of a heartbeat monitor.
After the CNN training process, the proposed system classifies heartbeats as (Normal/Abnormal) using a deep
learning CNN trained on the MIT-BIH dataset. Figure 5 shows our model layers.

Fig. 5. Algorithm architecture


EXPERIMENTAL SETUP 


The system ran on a laptop with the following components: a 6th-generation Intel Core i7 CPU, 8 GB of
RAM, and integrated Intel HD Graphics 4600 (2 GB).


Python Language and Parameters 


The Python (3.9.7) programming language is used with the libraries Pandas, Numpy, Keras, Torch,
Matplotlib, Seaborn, and Sklearn for ECG classification into two classes (Normal, Abnormal). Several
hyperparameters are applied to train the model, and the best results are retained for later comparison and
performance analysis of the model.


Precision 


It is defined as the proportion of correctly predicted positives among all items with a positive prediction.
Precision is shown in equation (1) [17,21].


$$\text{Precision} = \frac{TP}{TP + FP} \qquad (1)$$


Recall 


It is the ratio of TP to the total number of ground-truth positives, and is usually referred to as sensitivity.
Recall is shown in equation (2) [17,25].


$$\text{Recall} = \frac{TP}{TP + FN} \qquad (2)$$
F1-Score


The F1-score balances the trade-off between precision and recall, which are the two essential components
in its calculation [26,27]. It is stated mathematically as follows:


$$F1 = \frac{2\,TP}{2\,TP + FP + FN} \qquad (3)$$
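As an illustration of the three metrics above, the following sketch computes them from confusion-matrix counts. The counts in the example are hypothetical and are not results from this work; returning 0.0 for an empty denominator is a convention chosen here.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); 0.0 if nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Recall (sensitivity) = TP / (TP + FN); 0.0 if there are no positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts, for illustration only.
tp, fp, fn = 90, 10, 5
p, r, f1 = precision(tp, fp), recall(tp, fn), f1_score(tp, fp, fn)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.9 0.947 0.923
```

The count-based form of F1 agrees with the harmonic mean 2PR / (P + R), which is a quick consistency check for any implementation.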


Model Training 


The extracted features are used to train pre-trained Convolutional Neural Network (CNN) models [4,28],
importing the weights of a model trained on the MIT-BIH dataset. To retrain the model for the new
classification task, the final fully connected (FC) layer with 1000 neurons was removed and replaced with an
FC layer with two neurons, one per required class. The parameters of the frozen layers are not updated during
training, which speeds up training considerably. Softmax was employed as the activation function of the output
layer. New layers are then inserted after the frozen ones; freezing is accomplished by setting the upper layers
as non-trainable (False). The model's last layers are adapted to the specific classification task: average
pooling, a dense FC layer with the ReLU activation function (AF), a dense FC layer with the Softmax function,
and the classification output layer. A pooling layer is added to improve feature extraction (FE) [26]
performance. Additional FC layers are introduced to perform classification using features learned from the new
input set. ReLU is the AF most often used in CNNs. The input size is (188x1), and a summary of each model is
used to check its layers and feature maps for correctness. A learning rate of 1e-4 and a batch size of 32 were
used for a total of 50 epochs. The feature map is used as the input to a fully connected layer to obtain the
classification results. The original convolutional neural network model does not minimize overfitting. As a
result, these models need image data optimization and parameter fine-tuning.
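The role of the two-neuron Softmax output layer can be sketched as follows. The class order (index 0 = Normal, index 1 = Abnormal) and the names `softmax`, `classify`, and `CLASS_NAMES` are assumptions made for illustration; the text does not state how the outputs are ordered.

```python
import math
from typing import List, Sequence

# Assumed class order; not specified in the text.
CLASS_NAMES = ("Normal", "Abnormal")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw output-layer values into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: Sequence[float]) -> str:
    """Return the name of the class with the highest probability."""
    probs = softmax(logits)
    return CLASS_NAMES[probs.index(max(probs))]
```

For example, `classify([2.0, -1.0])` returns `"Normal"` under the assumed class order, because the first output has the larger probability.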


The designed CNN includes convolution layers, pooling layers, and fully connected layers. Forward
propagation is the process of converting input data into output. The convolution layer performs feature
extraction, which consists of the convolution operation and the activation function. A pooling layer applies a
conventional downsampling operation that reduces the in-plane dimensionality of the feature maps, in order to
add translation invariance to small shifts and distortions and to limit the number of subsequent learnable
parameters. Max pooling with a (2x1) filter and a stride of two is used. The output feature maps of the final
convolution or pooling layer are typically flattened, that is, transformed into a 1D array, and connected to
fully connected layers, in which every input is connected to every output by a trainable weight. The features
produced by the convolution layers and the downsampling layers are then mapped to the network's final
outputs, such as the probability of each class in classification, by a set of fully connected layers, each of
which is followed by an activation function [25,28,29,30].
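The effect of a convolution layer followed by (2x1) max pooling with a stride of two can be sketched in plain Python. The kernel size of 5, the absence of padding, and the function names are assumptions for illustration; the text does not give the kernel sizes of the actual model.

```python
from typing import List, Sequence


def conv1d_relu(x: Sequence[float], kernel: Sequence[float],
                bias: float = 0.0) -> List[float]:
    """Valid (no padding) 1D convolution followed by ReLU.

    As in most deep learning libraries, the kernel is not flipped.
    """
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        s = sum(x[i + j] * kernel[j] for j in range(k)) + bias
        out.append(max(0.0, s))  # ReLU activation
    return out


def max_pool1d(x: Sequence[float], size: int = 2, stride: int = 2) -> List[float]:
    """Keep the maximum of each window of `size` samples, moving by `stride`."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]


# Shape check for a 188-sample input and an assumed kernel of 5 taps.
beat = [0.0] * 188
features = conv1d_relu(beat, [0.2] * 5)
pooled = max_pool1d(features)
print(len(features), len(pooled))  # 184 92
```

Pooling halves the feature-map length, which is how it limits the number of parameters in the fully connected layers that follow.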


The training parameters used in CNN training are (Dropout = 0.4, learning rate = 1x10^-3,
optimizer = Adam, batch size = 1024, steps/epoch = 10). The hardware used for training is a computer with the
specifications shown in Table 1.
TABLE 1. Hardware Requirements

Component                    Specification
Microprocessor               Intel Core i7, 7th generation
Microprocessor speed         2.1 GHz
RAM                          8 GB
Operating system             Windows, 64-bit
Graphical processing unit    Intel HD Graphics 620 (internal)
Hard disk drive              SSD


Results and discussion 


At the beginning of training, the first ECG signals in the dataset and their accompanying labels were
displayed, as shown in Fig. 6.


Fig. 6. The first 5 ECG signals in the dataset and their labels (x-axis: Time (ms))


After that, the dataset is normalized, as shown in Fig. 7. The normalized data is then saved to a CSV file
for later use.
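The text does not state which normalization is applied. As one common possibility, the sketch below rescales each beat to the range [0, 1] with min-max scaling; both the choice of method and the name `min_max_normalize` are assumptions.

```python
from typing import List, Sequence


def min_max_normalize(beat: Sequence[float]) -> List[float]:
    """Rescale one beat to [0, 1]; a constant beat maps to all zeros."""
    lo, hi = min(beat), max(beat)
    if hi == lo:
        return [0.0] * len(beat)  # avoid division by zero
    span = hi - lo
    return [(v - lo) / span for v in beat]


print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```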


Fig. 7. Normalized data


Figure 8 shows the smoothed training and validation loss curves. Both losses are close to zero, which is a
very good result for training and validation.


Fig. 8. The training and validation losses (x-axis: Epochs)
Figure 9 shows two sample ECG signals, one for each of the two classes (normal and abnormal heartbeats).
The proposed system assigns an input ECG signal to one of these classes.


Fig. 9. (a) Sample of abnormal ECG signal and (b) sample of normal ECG signal


At the final epoch (50), the training results (Fig. 10) show a training loss of 0.0070, a validation loss of
0.0896, a validation accuracy of 0.9861, and a best accuracy of 0.9866. This is a very good and promising
accuracy for ECG heartbeat classification.


Train Loss: 0.0070 Val Loss: 0.0893 Val Accuracy: 0.9859
Epoch 49/50
Adjusting learning rate of group 0 to 8.1000e-06.
Train Loss: 0.0072 Val Loss: 0.0898 Val Accuracy: 0.9861
Epoch 50/50
Adjusting learning rate of group 0 to 2.4300e-06.
Train Loss: 0.0070 Val Loss: 0.0896 Val Accuracy: 0.9861
Finished Training and the best accuracy is: 0.9866


Fig. 10. Training results of CNN algorithm 


After the proposed algorithm is trained, a weight file is produced. The model has 254901 parameters in
total, and training for fifty epochs takes about an hour. Part of the training process is shown in Fig. 11.
Epoch 29: val_acc improved from 0.98755 to 0.98767, saving model to baseline_cnn_mitbih.h5
2463/2463 - 96s - loss: 0.0278 - acc: 0.9911 - val_loss: 0.0474 - val_acc: 0.9877 - lr: 1.0000e-05 - 96s/epoch - 39ms/step
Epoch 30/50
Epoch 30: val_acc did not improve from 0.98767
2463/2463 - 97s - loss: 0.0281 - acc: 0.9910 - val_loss: 0.0475 - val_acc: 0.9876 - lr: 1.0000e-05 - 97s/epoch - 39ms/step
Epoch 31/50
Epoch 31: val_acc did not improve from 0.98767
2463/2463 - 98s - loss: 0.0278 - acc: 0.9911 - val_loss: 0.0476 - val_acc: 0.9877 - lr: 1.0000e-05 - 98s/epoch - 40ms/step


Fig. 11. Part of training process 


After training, the proposed system was tested for accuracy and performance. It was tested on 100
heartbeats, one heartbeat per test, to determine whether each heartbeat was normal or not. The accuracy on
these 100 heartbeats was 99%. These are very good and promising results. The system could be used in the
medical field after further improvement and testing on thousands of cases to ensure excellent results.


Conclusions 


Cardiovascular disorders are a great threat to human life, and treating them relies heavily on proper ECG
analysis. Manual analysis of ECG readings costs a medical professional time and money. It is therefore crucial
to create an automated technique for recognizing arrhythmias. The MIT-BIH dataset serves as the primary
source of data for this study.


In this paper, a deep CNN for heartbeat classification is trained with a (.csv) file as input, using
procedures optimized for speed and accuracy. Our system achieves a training loss of 0.0070, a validation loss
of 0.0896, a validation accuracy of 0.9861, and a best accuracy of 0.9866. The test accuracy on the 100
heartbeats was 99%. According to these results, the proposed method is efficient and capable of making
predictions in both categories (normal, abnormal) with good and promising accuracy and performance. In
addition, the proposed system architecture is simpler than the architectures mentioned in the related works.


In future work, we will leverage mobile and cloud technologies. It is also essential to create wearable
technology that uses less energy. The procedures put into place may be rebuilt to operate with a larger variety
of classes, the work can be adapted for real-time use, and precision can be improved continuously.
Furthermore, the same classification procedure may be applied to several other kinds of datasets, including
stress and clinical datasets.